First task:
Do the exercise proposed on slide 59 (linear relationship with 45-degree rotation) and comment on the results.
Second task:
Option 1: synthetic sample
Write a program that generates medium-sized samples (about 1000 objects) with three parameters (random variables) per object, generated as follows:
- x: Gaussian distribution N(0, sigmaX)
- y = e^x + Gaussian noise N(0, sigmaErr)
- z: Gaussian distribution N(0, sigmaZ)
Generate one sample for each of the following cases and run a first PCA on each:
- small sigmaX and sigmaErr << sigmaX
- large sigmaX and sigmaErr << sigmaX
- small sigmaX and sigmaErr ~ sigmaX
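The generation step might be sketched as follows. This is only a minimal illustration: it assumes the relation for y is read as y = e^x plus noise, and the function name `generate_sample`, the case labels, and all numeric sigma values are hypothetical choices, not values prescribed by the assignment.

```python
import math
import random


def generate_sample(n, sigma_x, sigma_err, sigma_z, seed=None):
    """Return n objects (x, y, z) with y = exp(x) + Gaussian noise."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x = rng.gauss(0.0, sigma_x)
        y = math.exp(x) + rng.gauss(0.0, sigma_err)
        z = rng.gauss(0.0, sigma_z)
        sample.append((x, y, z))
    return sample


# Illustrative parameter choices for the three cases (assumed values).
CASES = {
    "small_sigmaX_small_err": {"sigma_x": 0.1, "sigma_err": 0.001},
    "large_sigmaX_small_err": {"sigma_x": 2.0, "sigma_err": 0.02},
    "small_sigmaX_similar_err": {"sigma_x": 0.1, "sigma_err": 0.1},
}

samples = {
    name: generate_sample(1000, sigma_z=1.0, seed=index, **params)
    for index, (name, params) in enumerate(CASES.items())
}

for name, rows in samples.items():
    print(name, len(rows), rows[0])
```

Fixing the seed per case keeps each sample reproducible, which helps when the PCA results are compared later.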
* Next, define a set of alternative variables (x', y', z') as functional combinations of the original
ones, chosen to make the new variables better suited to PCA (that is, so that the relationships
between them are linear; see notes).
* Generate new files from the original samples using the new variables
* Run a new PCA on the new samples. Compare the results with those of the first PCA and
discuss the improvement.
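As a rough illustration of the comparison, the sketch below measures the fraction of variance carried by the first principal component before and after a change of variables. Several things here are assumptions: the relation is read as y = e^x plus noise, the candidate transformation y' = ln(y) is just one possible choice, the sigma values are arbitrary, and PCA is approximated by power iteration on the correlation matrix rather than by the tool used in the course.

```python
import math
import random


def correlation_matrix(rows):
    """Pearson correlation matrix of a list of equal-length tuples."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(dims)
    ]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows)
            / ((n - 1) * stds[i] * stds[j])
            for j in range(dims)
        ]
        for i in range(dims)
    ]


def leading_eigenvalue(matrix, iterations=500):
    """Largest eigenvalue of a symmetric positive semidefinite matrix
    (power iteration)."""
    dims = len(matrix)
    v = [1.0 / math.sqrt(dims)] * dims
    value = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        value = math.sqrt(sum(c * c for c in w))
        v = [c / value for c in w]
    return value


# Synthetic sample: large sigmaX, sigmaErr << sigmaX (assumed values).
rng = random.Random(0)
original = []
for _ in range(1000):
    x = rng.gauss(0.0, 1.5)
    y = math.exp(x) + rng.gauss(0.0, 0.01)
    z = rng.gauss(0.0, 1.0)
    original.append((x, y, z))

# Candidate transformation (an assumption): y' = ln(y).
# Objects with y <= 0 are dropped because the logarithm is undefined there.
transformed = [(x, math.log(y), z) for x, y, z in original if y > 0]

# The trace of a 3x3 correlation matrix is 3, so eigenvalue / 3 is the
# fraction of the standardized variance explained by the first component.
frac_before = leading_eigenvalue(correlation_matrix(original)) / 3
frac_after = leading_eigenvalue(correlation_matrix(transformed)) / 3
print(f"PC1 variance fraction before: {frac_before:.3f}")
print(f"PC1 variance fraction after:  {frac_after:.3f}")
```

Since z is independent of x and y, the first component can carry at most about two thirds of the standardized variance; how close the transformed sample gets to that limit is one way to quantify the improvement.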
Option 2: use a data sample from your field of work
* Document the sample as indicated in the notes
* Run a PCA and discuss the results
* Discuss whether a functional combination of the original variables can improve the PCA results.
* If so, apply it and discuss the results.
In both cases, please include the data file used, in Weka (ARFF) format, in your submission.
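For reference, a numeric sample can be serialized to Weka's ARFF text format along the following lines. This is a minimal sketch: the helper name `to_arff`, the relation name, and the two example rows are hypothetical, and only numeric attributes are handled.

```python
def to_arff(rows, relation="synthetic_sample", attributes=("x", "y", "z")):
    """Serialize numeric rows as ARFF text: header, attribute list, data."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in attributes]
    lines += ["", "@data"]
    lines += [",".join(f"{value:.6g}" for value in row) for row in rows]
    return "\n".join(lines) + "\n"


# Two made-up rows, only to show the resulting layout.
arff_text = to_arff([(0.1, 1.12, -0.5), (-0.3, 0.74, 1.25)])
print(arff_text)
```

The returned string can then be written to a file with an `.arff` extension and opened directly in Weka.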
Available from: Wednesday, 17 November 2010, 10:30
Due date: Wednesday, 1 December 2010, 10:30